GloMPO (Globally Managed Parallel Optimization): a tool for expensive, black-box optimizations, application to ReaxFF reparameterizations

In this work we explore the properties that make many real-life global optimization problems extremely difficult to handle, and some of the common techniques used in the literature to address them. We then introduce a general optimization management tool called GloMPO (Globally Managed Parallel Optimization) to help address some of the challenges faced by practitioners. GloMPO manages and shares information between traditional optimization algorithms run in parallel. We hope that GloMPO will be a flexible framework which allows for customization and hybridization of various optimization ideas, while also providing a substitute for the human interventions and decisions which are a common feature of optimizations of hard problems. GloMPO is shown to produce lower minima than traditional optimization approaches on global optimization test functions, the Lennard-Jones cluster problem, and ReaxFF reparameterizations. The novel feature of forced optimizer termination was shown to find better minima than normal optimization. GloMPO is also shown to provide qualitative benefits such as identifying degenerate minima, and to provide a standardized interface and workflow manager.

S1 Test functions

Rastrigin

f_{\text{Rastrigin}}(\mathbf{x}) = 10d + \sum_{i=1}^{d} \left[ x_i^2 - 10 \cos(2\pi x_i) \right]   (S1)

where d = 66 and -5.12 < x_i < 5.12.
x_min = {0, 0, ..., 0}_d, f_{\text{Rastrigin}}(x_min) = 0.
Source: Hoffmeister and Bäck [1]

Deceptive

x_min = \alpha, f_{\text{Deceptive}}(x_min) = -1.
Source: Saud and Mohamed [3]

Schwefel

f_{\text{Schwefel}}(\mathbf{x}) = -\sum_{i=1}^{d} x_i \sin\left(\sqrt{|x_i|}\right)

where d = 20 and -500 < x_i < 500.
x_min = {420.9687, 420.9687, ..., 420.9687}_d, f_{\text{Schwefel}}(x_min) = -418.9829d.
Source: Karaboga and Basturk [2]

Shubert

x_min: many degenerate locations, f_{\text{Shubert}}(x_min) = -39303.5500543.
Source: Wang et al. [5]

S2 Management algorithm

A detailed algorithm of the GloMPO management loop, and how it uses each of the customizable classes, is given in Algorithm S1. The first step of the process is to fill the available worker slots by starting new optimizers with StartNewWorkers, which uses the selector to decide which of the available optimizer types to start and the generator to select an initial starting point. Note that optimizers is a list of optimizer types, or blueprints, not actual instances.
GloMPO implements two communication channels between the manager and its children. The first is a queue shared between all children. The queue is unidirectional and centralizes all the function evaluation results. This ensures that evaluations are collected in the order in which they are produced, and, by using a size-limited queue, fast functions can be throttled to remain in sync with the manager. The second channel is a bidirectional pipe through which the child can message the manager, and the manager can control the child. This channel ensures that an open connection is always available between the child and manager even when the queue is full. Overall, this system is more robust than using a single channel, as different types of information transfer can be handled in distinct ways.
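The sketch below illustrates this two-channel design with Python's multiprocessing primitives: a size-limited Queue for results and a Pipe for control. It is a minimal, self-contained toy, not GloMPO's actual API; the objective function, the 'stop' protocol, and all names are invented for illustration.

import multiprocessing as mp
import queue

def child(opt_id, results, conn):
    """Toy child optimizer: streams evaluations and obeys a 'stop' signal."""
    x = 0.0
    while True:
        if conn.poll():                          # the control pipe stays readable
            if conn.recv() == "stop":            # even when the results queue is full
                conn.send(("done", opt_id))
                return
        try:
            # Size-limited queue: a fast child blocks here, staying in sync.
            results.put((opt_id, x, (x - 3.0) ** 2), timeout=0.1)
            x += 0.5
        except queue.Full:
            pass                                 # throttled; loop back and re-poll the pipe

if __name__ == "__main__":
    results = mp.Queue(maxsize=10)               # unidirectional, shared by all children
    mgr_end, child_end = mp.Pipe()               # bidirectional, one pipe per child
    proc = mp.Process(target=child, args=(1, results, child_end))
    proc.start()
    for _ in range(20):
        print(results.get(timeout=5))            # results arrive in evaluation order
    mgr_end.send("stop")                         # control works even with a full queue
    print(mgr_end.recv())
    proc.join()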
CheckOptimizerMessages polls each worker for messages about its status (i.e., checking whether it has converged or crashed). This call also performs any necessary post-processing and cleanup should the worker have ended.
The manager then waits on the queue for results from the workers for a defined period of time (t). If new results are ready, they are extracted and added to the results log, from which the new best solution is determined. The log is then examined to determine whether the hunter condition is satisfied for any of the optimizers. If so, those optimizers are terminated (lines 13 to 16).
At this point (line 18) the manager shares the incumbent solution with all its children. They may use this information to better guide their explorations if configured to do so.
InspectOptimizers is a routine which checks the status of the workers to ensure they have not stalled, ended without sending feedback to the manager, or failed to respond to control signals. Any transgressors are shut down and cleaned up. This routine is distinct from CheckOptimizerMessages since it handles irregular or unexpected behavior.
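For illustration, a check in the spirit of InspectOptimizers might combine process liveness with a staleness timeout; the worker attributes used here (process, sent_final_message, last_result_time) are hypothetical stand-ins, not GloMPO's real fields.

import time

def needs_cleanup(worker, grace=60.0):
    """Flag children that died without a final message or stopped producing results."""
    died_silently = (not worker.process.is_alive()) and not worker.sent_final_message
    stalled = time.time() - worker.last_result_time > grace
    return died_silently or stalled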
Finally, CheckConvergence determines if the checker condition for GloMPO termination is satisfied. If so, the loop exits, all workers are shut down, and the best solution is returned.
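To make the control flow of Algorithm S1 concrete, the following is a toy, single-threaded rendition of the management loop; the random-search child, the fixed selector, the uniform generator, and the simple hunting rule are all invented stand-ins for GloMPO's configurable components.

import random

class ToyOptimizer:
    """Stand-in child optimizer: random search around its start point."""
    def __init__(self, f, x0):
        self.f = f
        self.x = x0
        self.y = f(x0)

    def step(self):
        """One function evaluation; keep the move only if it improves."""
        x = self.x + random.uniform(-0.5, 0.5)
        y = self.f(x)
        if y < self.y:
            self.x, self.y = x, y
        return self.x, self.y

def manage(f, n_workers=4, iterations=200):
    workers, results_log = [], []
    best = (None, float("inf"))
    for _ in range(iterations):                        # while not is_converged
        while len(workers) < n_workers:                # StartNewWorkers: here the
            x0 = random.uniform(-10.0, 10.0)           # 'generator' is uniform random
            workers.append(ToyOptimizer(f, x0))        # and the 'selector' is fixed
        for w in workers:
            results_log.append(w.step())               # GetResults
        best = min(results_log + [best], key=lambda r: r[1])
        # HuntOptimizer (lines 13 to 16): terminate children stuck far above the best.
        workers = [w for w in workers if w.y < best[1] + 1.0]
    return best

if __name__ == "__main__":
    print(manage(lambda x: (x - 3.0) ** 2))            # expect x near 3, f near 0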

S3.1 Basin-hopping
Algorithm S2 is a simplified procedure for the basin-hopping algorithm as implemented in SciPy v1.2.1 [4]. Algorithm S3 details the procedure of the 'basin-hopping generator' used in Test B. The generator algorithm follows the process of the actual basin-hopping algorithm as closely as possible, with some minor tweaks to allow local optimizations to run asynchronously in parallel.
The optimizer routine can be broadly divided into four parts: (1) adjust the step size based on the acceptance rate; (2) randomly perturb the current location (limited by a maximum step size); (3) run a local minimization from the new point; (4) accept or reject the minimization's result as the starting point for new steps based on a Metropolis-Hastings test. The generator follows the same procedure, but the order is altered since its output comes before part 3, i.e., it must return the new point for optimization by one of the manager's children. At first, the generator returns random points until the manager has accumulated some results. Thereafter, the location from which the step is taken is the best point seen thus far by the manager from any child. This is reasonable since, in the optimizer algorithm, steps are most likely taken from the best point. Occasionally, however, the step is taken from a worse point. To mimic this, the best point from another random child is sometimes chosen, and accepted as the starting point with a Metropolis-Hastings test.

Algorithm S1 Basic loop of the GloMPO management structure
 1: procedure ManageOptimization(f(·), optimizers, selector, generator, hunter, checker, t)
 2:     result ← ∅
 3:     is converged ← False
 4:     workers ← empty list
 5:     results log ← empty list
 6:     while not is converged do
 7:         workers ← workers + StartNewWorkers(f(·), selector, optimizers, generator)
 8:         CheckOptimizerMessages(workers)
 9:         wait new results ← GetResults(workers) timeout t
10:         if new results then
11:             results log ← results log + new results
12:             result ← min(results log)
13:             for all worker in workers do
14:                 kill worker ← HuntOptimizer(hunter, results log, worker)
15:                 if kill worker then
16:                     ShutdownWorker(worker)
17:             end for
18:         ShareBest(workers, result)    ▷ Share the incumbent solution with all children.
19:         InspectOptimizers(workers)
20:         is converged ← CheckConvergence(checker, results log)
21:     end while
22:     ShutdownWorker(workers)
23:     return result
24: end procedure

Algorithm S2 Simplified basin-hopping optimizer
 1: procedure BHOptimizer(f(·), x0, imax, interval, target, ∆, factor, T)
 2:     x ← x0
 3:     y ← f(x)
 4:     naccept ← 0
 5:     for i ← 1 to imax do
 6:         if i mod interval is 0 then    ▷ Adjust step size (∆) every interval iterations.
 7:             accept rate ← naccept / i
 8:             if accept rate > target then
 9:                 ∆ ← ∆ / factor
10:             else
11:                 ∆ ← ∆ · factor
12:             end if
13:         end if
14:         step ← RandomVector(−∆, ∆)    ▷ Uniform random perturbation in each dimension.
15:         x trial ← x + step
16:         x trial, y trial ← BFGS(x0 = x trial)    ▷ Local optimization returning a new minimum.
17:         accept ← Metropolis(y, y trial, T)    ▷ Metropolis-Hastings accept/reject test.
18:         if accept then
19:             x, y ← x trial, y trial
20:             naccept ← naccept + 1
21:         end if
22:     end for
23:     return x, y
24: end procedure
The generator routine can thus be summarized as follows: (1) identify the best location seen thus far; (2) accept or reject a move to another good point explored by another child, based on a Metropolis-Hastings test; (3) adjust the step size based on the acceptance rate; (4) randomly perturb the current location (limited by a maximum step size); (5) return the new perturbed point.
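Step 2 relies on the standard Metropolis accept/reject criterion, the same test named in Algorithms S2 and S4. A minimal Python version (the function name is ours) is:

import math
import random

def metropolis(y_old, y_new, temperature):
    """Accept improvements outright; accept worse points with probability
    exp(-(y_new - y_old) / T), the standard Metropolis criterion."""
    if y_new <= y_old:
        return True
    return random.random() < math.exp(-(y_new - y_old) / temperature)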
Something should be mentioned about the step size. Because this is a single parameter which grows and shrinks based on the performance of the main chain, it is poorly suited to taking steps from other points when they are selected. One could extend the algorithm with a second step-size parameter for these special jumps, but we have not done so here in order to keep the generator close to its parent optimizer. If a special step finds a new best point, such that most new optimizers are started from there and a new main chain begins, the step size will adapt over time to the new region.
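The feedback rule itself (Algorithm S2, lines 6 to 12) fits in a few lines of Python; the default target and factor below are illustrative choices, not prescribed values:

def adjust_step(delta, n_accept, n_steps, target=0.5, factor=0.9):
    """Widen steps when too many trial moves are accepted (factor < 1, so
    delta / factor > delta) and shrink them otherwise."""
    accept_rate = n_accept / n_steps
    return delta / factor if accept_rate > target else delta * factor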

S3.2 Dual annealing
Algorithm S4 shows the simplified pseudocode for the dual annealing optimizer. In this case the optimizer and generator match almost exactly, and our generator implementation mostly calls internal SciPy code. When the generator is called, it runs the same steps as the optimizer from lines 8 to 34. The core difference is that, unlike the optimizer, the generator must return a location for a local search. Thus, the generator performs a check at this point to see whether x differs from the last time the generator was called. If so, the new x is returned; otherwise the annealing step is repeated. The annealing can be repeated a maximum of five times before the temperature is reset and x is moved to a random location.
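This retry-then-reset logic can be summarized as follows; anneal_step and reset_temperature stand in for the SciPy internals the real generator calls, and the function name is ours:

import random

def next_start_point(anneal_step, reset_temperature, bounds, last_x, max_retries=5):
    """Re-run the annealing step until it yields a point that differs from the
    previous call; after max_retries failures, reset and relocate randomly."""
    for _ in range(max_retries):
        x = anneal_step()                  # same procedure as optimizer lines 8 to 34
        if x != last_x:
            return x                       # fresh point: hand it to a local search
    reset_temperature()                    # annealing is stuck; restart the schedule
    return [random.uniform(lo, hi) for lo, hi in bounds]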

Algorithm S4 Simplified dual annealing optimizer (excerpt)
        ⋮
22:         y visit ← f(x visit)
23:         if y visit < y then
24:             x, y ← x visit, y visit
25:             if y visit < y best then
26:                 x best, y best ← x visit, y visit
27:             end if
28:         else
29:             accept ← Metropolis(y visit, y, qa)    ▷ Accept moves to worse locations based on a Metropolis-Hastings test.
30:             if accept then
31:                 x, y ← x visit, y visit
32:             end if
33:         end if
34:     end for
35:     run local ← CheckPerformance    ▷ Local search based on various factors like the success of the annealing steps.
36:     if run local then
37:         x, y ← BFGS(x0 = x)
38:         if y < y best then
39:             x best, y best ← x, y
40:         end if
41:     end if
        ⋮

Notes to the results tables:
• 'Conv.', ns, and ng refer to the convergence settings, the number of serial optimizers, and the number of GloMPO optimizers used, respectively (see Table 2).
• 100 repeats were run for every set; the local optimizers had a convergence tolerance of 10⁻⁵.
• 10 repeats were run for every set of the hunting-configuration comparisons.
• 'Loose' and 'Strict' hunting configurations refer to how aggressively GloMPO shut down child optimizers. A 'Loose' hunting style allowed optimizers to remain alive for longer, well into the focus phase; 'Strict' hunting terminated the optimizers as soon as they began to appear to focus.